A novel method of data analysis for hadronic physics

A novel method for extracting physical parameters from experimental and simulation data is
presented. The method is based on statistical concepts and relies on Monte Carlo simulation
techniques. It identifies, and determines with maximal precision, the parameters to which the
data are sensitive. The method has been extensively studied and is shown to produce unbiased
results. It is applicable to a wide range of scientific and engineering problems. It has been
successfully applied in the analysis of experimental data in hadronic physics and of lattice
QCD correlators.

A principal task in experimental and computational physics concerns the determination of the
parameters of a theory (model) from experimental or simulation data. Examples are abundant: the
determination of the parameters of the Standard Model in Particle Physics, the multipole
amplitudes in Nucleon Resonance excitation in hadronic physics, the determination of the
spectrum from correlators in lattice QCD calculations, the parameters of acoustic resonances in
Cosmology, to mention a few.

The data from which information on parameters of the theory is to be extracted are
characterized by statistical uncertainties and systematic errors, are typically of limited
dynamic range and are sensitive only to a few of the model parameters, rendering this task
difficult and often intractable. Which parameters of the model can be determined from the
available data is often difficult to prejudge, and their extraction without bias is often
impossible. Particularly hard is the determination of the systematic and model uncertainties
that ought to be assigned to the extracted values of the parameters.

We address this problem via a method that we will refer to as the Athens Model Independent
Analysis Scheme, "AMIAS", which is capable of extracting theory (model) parameters and their
uncertainties from a set of data in a rigorous, precise, and unbiased way. The methodology is
first presented and then applied to two problems in hadronic physics, which are used as
demonstration cases. The two cases concern A) the extraction of the mass spectrum of hadrons
from Euclidean time correlators in lattice QCD simulations and B) the extraction of the
multipole excitation amplitudes for the Nucleon resonances, in particular that of the first
excited state of the Nucleon, the Δ(1232) resonance.

The AMIAS method is applicable to problems in which the parameters to be determined are linked
in an explicit way to the data through a theory or model. There is no requirement that this set
of parameters be orthogonal; they can be subjected to constraints, e.g. by requiring that
unitarity is satisfied. The method requires that a quantitative criterion for the "goodness" of
a solution is chosen, and thus far we have employed the $\chi^2$ criterion.

For a given theory any set of values for its parameters, satisfying its symmetries and
constraints, provides a solution having a finite probability of representing reality. This
probability can be quantified through a comparison to the data being analyzed. Based on these
concepts AMIAS can be formulated as follows:

A set of parameters $A_1, A_2, \ldots, A_N = \{A_\nu\}$ which completely and explicitly
describes a process within a theory can be determined from a data set
$\{V_k \pm \varepsilon_k\}$, produced by this process, by noting that any arbitrary set of
values $\{a_\nu\}^j$ for these parameters constitutes a solution having a probability $P(j)$
of representing "reality" which is equal to:

$$P(j) = G[\chi^2(j), \{a_\nu\}^j] \qquad (1)$$

where $G$ is a function of the data and the parameters of the model and of $\chi^2$, where

$$\chi^2(j) = \sum_k \left\{ \frac{U_k^j - V_k}{\varepsilon_k} \right\}^2 \qquad (2)$$

Thus $P(j)$ is a function of the $\chi^2$ resulting from the comparison to the data
$\{V_k \pm \varepsilon_k\}$ of the values $U_k^j$ predicted by the theory for the
$\{a_\nu\}^j$ solution.

In the case where we choose $G = e^{-\chi^2/2}$, the results obtained by AMIAS are related to
those obtained by the $\chi^2$ minimization methods that are widely used and implemented in a
number of codes (e.g. MINUIT [?]). The results become identical if correlations among the
parameters of the theory are absent or ignored.
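
As an illustration of the $\chi^2$ of Eq. (2) and of the choice $G = e^{-\chi^2/2}$, the following minimal Python sketch evaluates the $\chi^2$ and the corresponding unnormalized weight of a single trial solution. The function names, the data values and the straight-line model are hypothetical and were chosen only for illustration; they are not taken from the AMIAS implementation.

```python
import math

def chi_square(predicted, measured, errors):
    """Eq. (2): sum over data points of ((U_k - V_k) / eps_k)^2."""
    return sum(((u - v) / e) ** 2 for u, v, e in zip(predicted, measured, errors))

def weight(chi2):
    """Unnormalized probability of a solution for the choice G = exp(-chi2 / 2)."""
    return math.exp(-0.5 * chi2)

def line_model(params, xs):
    """Hypothetical model U_k = a0 + a1 * x_k, used only for illustration."""
    a0, a1 = params
    return [a0 + a1 * xk for xk in xs]

# Hypothetical data set {V_k +/- eps_k}
x = [0.0, 1.0, 2.0, 3.0]
measured = [1.1, 2.9, 5.2, 6.8]
errors = [0.2, 0.2, 0.2, 0.2]

trial = (1.0, 2.0)  # one arbitrary solution {a_nu}^j
chi2 = chi_square(line_model(trial, x), measured, errors)
print(chi2, weight(chi2))
```

Only ratios of such weights matter once the ensemble is normalized, so any constant factor in $G$ drops out of the final distributions.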

We call an ensemble $Z$ of such $\{a_\nu\}^j$ solutions a Canonical Ensemble of Solutions; its
properties depend only on the experimental data set. Similarly, a Microcanonical Ensemble of
Solutions can be defined as the collection of solutions which are characterized by

$$\chi^2_a < \chi^2 < \chi^2_b \qquad (3)$$

where $\chi^2_a$ and $\chi^2_b$ define a sufficiently narrow range in $\chi^2$ space. A case
of particular interest concerns the microcanonical ensemble near the minimum $\chi^2$ value:

$$\chi^2 < \chi^2_{min} + C \qquad (4)$$

where $C$ is usually taken to be the constant equal to the effective degrees of freedom of the
problem.
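
The two selections just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ensemble is represented as a list of (parameters, $\chi^2$) pairs, the inequalities are taken as strict, and the numerical values, including the choice C = 2.0, are hypothetical.

```python
def microcanonical_band(ensemble, chi2_a, chi2_b):
    """Solutions whose chi2 lies in the narrow range (chi2_a, chi2_b)."""
    return [(params, chi2) for params, chi2 in ensemble if chi2_a < chi2 < chi2_b]

def microcanonical_near_minimum(ensemble, c):
    """Solutions with chi2 < chi2_min + c, as in Eq. (4)."""
    chi2_min = min(chi2 for _, chi2 in ensemble)
    return [(params, chi2) for params, chi2 in ensemble if chi2 < chi2_min + c]

# Hypothetical ensemble: each entry is (parameter values, chi2 of that solution)
ensemble = [
    ((0.9, 2.1), 3.4),
    ((1.0, 2.0), 2.5),
    ((1.4, 1.7), 9.8),
    ((1.1, 1.9), 4.0),
]

near_min = microcanonical_near_minimum(ensemble, c=2.0)
band = microcanonical_band(ensemble, 3.0, 5.0)
print(len(near_min), len(band))
```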

The extraction of the model parameters $\{A_\nu \pm \delta A_\nu\}$ for a specific set of data
can be accomplished by employing the following procedure:

• A canonical ensemble of solutions is constructed by randomly choosing values,
$\{a_\nu\}^j$, for the set of parameters $\{A_\nu\}$ of the theory within the allowed
physical limits and by imposing the required constraints. Each set $\{a_\nu\}^j$ constitutes
a point in the ensemble, which is labeled by the $\chi^2$ value this solution generates when
compared with the data. In the absence of constraints, any given model parameter $A_\nu$
will assume all allowed values with equal probability (equipartition postulate).

• To each point $\{a_\nu\}^j$ of the ensemble a probability is assigned, equal to $P(j)$.
Following standard statistical concepts, the probability $\Pi(a_\nu)$ of a parameter
$A_\nu$ assuming a value in the range $(a_\nu, a_\nu + \Delta a_\nu)$ is equal to:

$$\Pi(a_\nu) = \frac{\int_{a_\nu}^{a_\nu + \Delta a_\nu} \sum_j dA_\nu \, P(j)}{\int_{-\infty}^{\infty} \sum_j dA_\nu \, P(j)} \qquad (5)$$

This expression defines the Probability Distribution Function (PDF) of any parameter of the
theory for representing "reality". It thus contains the maximum information that can be
obtained from the given set of data. Having obtained the PDF, numerical results can be
derived, usually moments of the distribution. The mean value is normally identified as the
"solution" and the square root of the corresponding variance as its "uncertainty".
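
The two steps above can be sketched as a short Python program. It is a minimal illustration under stated assumptions, not the authors' implementation (which is not described here): the model is a hypothetical straight line, the weight is $e^{-\chi^2/2}$, the parameters are drawn uniformly inside hypothetical bounds, and all names and numerical values are chosen for illustration only.

```python
import math
import random

def chi_square(params, xs, ys, errs):
    """chi2 of a hypothetical straight-line model U_k = a0 + a1 * x_k."""
    a0, a1 = params
    return sum(((a0 + a1 * x - y) / e) ** 2 for x, y, e in zip(xs, ys, errs))

def sample_ensemble(xs, ys, errs, bounds, n_samples, rng):
    """Step 1: draw parameter sets uniformly within bounds, label each by its chi2."""
    ensemble = []
    for _ in range(n_samples):
        params = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        ensemble.append((params, chi_square(params, xs, ys, errs)))
    return ensemble

def normalized_weights(ensemble):
    """Step 2: P(j) = exp(-chi2/2), normalized; chi2_min is subtracted to avoid underflow."""
    chi2_min = min(c for _, c in ensemble)
    raw = [math.exp(-0.5 * (c - chi2_min)) for _, c in ensemble]
    total = sum(raw)
    return [w / total for w in raw]

def moments(ensemble, weights, index):
    """Weighted mean and standard deviation of one parameter."""
    mean = sum(w * p[index] for (p, _), w in zip(ensemble, weights))
    var = sum(w * (p[index] - mean) ** 2 for (p, _), w in zip(ensemble, weights))
    return mean, math.sqrt(var)

def pdf_histogram(ensemble, weights, index, lo, hi, n_bins):
    """Binned PDF of one parameter: sum of P(j) over the solutions falling in each bin."""
    hist = [0.0] * n_bins
    width = (hi - lo) / n_bins
    for (p, _), w in zip(ensemble, weights):
        if lo <= p[index] <= hi:
            hist[min(int((p[index] - lo) / width), n_bins - 1)] += w
    return hist

rng = random.Random(12345)
true_params = (1.0, 2.0)                 # hypothetical generator values
xs = [float(k) for k in range(8)]
errs = [0.5] * len(xs)
ys = [true_params[0] + true_params[1] * x + rng.gauss(0.0, e) for x, e in zip(xs, errs)]

bounds = [(-2.0, 4.0), (1.0, 3.0)]       # allowed range of a0 and a1
ensemble = sample_ensemble(xs, ys, errs, bounds, 40000, rng)
w = normalized_weights(ensemble)
results = [moments(ensemble, w, i) for i in range(2)]
hist_a1 = pdf_histogram(ensemble, w, 1, 1.0, 3.0, 40)

for name, (mean, std) in zip(("a0", "a1"), results):
    print(f"{name} = {mean:.3f} +/- {std:.3f}")
```

Uniform sampling is the simplest choice and wastes most samples on regions of negligible weight; the sketch makes no attempt at the efficiency needed for problems with many parameters.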

AMIAS manifestly makes minimal assumptions and introduces no methodological bias to the
solution. It determines those theory/model parameters to which the data are sensitive, yielding
PDFs that allow only a restricted range of values, usually with a well defined maximum and a
narrow width. If the data do not contain physical information to determine some of the
parameters, then the resulting PDFs are featureless. The underlying stochastic approach allows
easy scalability to a very large number of parameters, even with a limited number of data.

The method is computationally robust and stable; obviously it could not have been implemented
without the advent of powerful computers. We have developed algorithms that can be implemented
efficiently and produce results for realistic and demanding problems within reasonable
computation times. It is also apparent that the method is amenable to trivial computational
parallelism. A description of its algorithmic implementation is beyond the scope of this paper
and will be presented elsewhere [2].

The method has been validated, and its capabilities demonstrated, in a number of toy models and
in two physical problems in hadronic physics, both of current interest: A) the extraction from
lattice simulation data of the masses of the spectrum of mesons and baryons and B) the
extraction of multipole amplitude strength from nucleon electroexcitation spectra. In all cases
studied with pseudodata, the AMIAS method recovered the input parameters with the expected
statistical accuracy. The derived uncertainties are compatible with those obtained using the
"jackknife" technique [?] in the case of toy models and the lattice data.
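
For reference, the delete-one jackknife used in such comparisons can be sketched as follows. This is a generic textbook version written in Python, not the code used for the comparison reported here; the sample values and function names are hypothetical.

```python
import math

def jackknife(samples, estimator):
    """Delete-one jackknife: returns the full-sample estimate and its jackknife error."""
    n = len(samples)
    full = estimator(samples)
    # Re-evaluate the estimator with each sample removed in turn
    leave_one_out = [estimator(samples[:i] + samples[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    variance = (n - 1) / n * sum((t - mean_loo) ** 2 for t in leave_one_out)
    return full, math.sqrt(variance)

def mean(values):
    return sum(values) / len(values)

# Hypothetical measurements
data = [2.1, 1.9, 2.4, 2.0, 1.6, 2.2, 1.8, 2.0]
estimate, error = jackknife(data, mean)
print(estimate, error)
```

For the simple mean the jackknife error coincides with the usual standard error of the mean; its practical value lies in nonlinear estimators, such as fitted masses, where no closed-form error is available.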

We present below results with pseudodata to demonstrate the validity of the method for the two
cases mentioned above; analyses of physical data or simulations corresponding to these cases
have been published elsewhere.

A. Extraction of Mass Spectrum of Baryons 
from Euclidean Time Correlators 

There has been impressive progress in lattice QCD calculations, where new algorithms and faster
computers make feasible high-precision simulations close to the physical parameters [4]. As in
the case of experimental data, simulation data are characterized by statistical uncertainties
and systematic errors. To extract physical quantities such as masses of hadrons, decay
constants and form factors from lattice simulations, fits to the simulated data are performed.
As in other fields, various approaches have been explored in order to extract the physics of
interest [?].

We have successfully applied the AMIAS method to the analysis of two-point correlators which
result from calculations in Lattice QCD [7]. Its generalization to more complicated objects
such as three-point correlators is under study. Results extracted from lattice QCD simulations
have been presented and compared to traditional methods elsewhere [?].

In Lattice Gauge theories, the Euclidean time correlator $C(t)$ of an interpolating operator
$J(\mathbf{x}, t)$ and its spectral decomposition for zero three-momentum are:

$$C(t) = \sum_{\mathbf{x}} \langle J(\mathbf{x},t) \, J^\dagger(\mathbf{0},0) \rangle = \sum_{n=0}^{\infty} C_n e^{-m_n t} \qquad (6)$$

where the brackets denote the vacuum expectation value. The exponential dependence is correct
for Dirichlet boundary conditions. In the large $t$ limit the state with the lowest mass
(ground state) dominates the time dependence of the correlator. Fitting the asymptotic behavior
of $m_{eff}(t) = -\log\{C(t+1)/C(t)\}$ to a constant yields the lowest mass of the hadron, while
determination of higher masses gives the excitation energies of states with the same quantum
numbers as the ground state.
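
The approach of the effective mass to a plateau can be illustrated with a short Python sketch. It uses the convention $m_{eff}(t) = \log[C(t)/C(t+1)]$, which is positive for a decaying correlator, and a spectral sum truncated to two states; the amplitudes and masses are illustrative values and the function names are hypothetical.

```python
import math

def correlator(t, amplitudes, masses):
    """Truncated spectral sum: sum_n C_n * exp(-m_n * t)."""
    return sum(c * math.exp(-m * t) for c, m in zip(amplitudes, masses))

def effective_mass(values):
    """m_eff(t) = log(C(t) / C(t+1)) for consecutive time slices."""
    return [math.log(values[t] / values[t + 1]) for t in range(len(values) - 1)]

amplitudes = [1.0, 3.0]   # illustrative overlap amplitudes C_0, C_1
masses = [0.5, 1.0]       # illustrative masses m_0, m_1 in lattice units

values = [correlator(t, amplitudes, masses) for t in range(31)]
m_eff = effective_mass(values)
print(m_eff[0], m_eff[10], m_eff[-1])
```

At small $t$ the excited state pulls the effective mass above $m_0$; at large $t$ it settles at the ground-state value, which is the plateau that a constant fit would pick up.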






The case of lattice QCD simulations presents an excellent case for AMIAS. As required, the
framework that connects the data and the model parameters of interest, the masses $m_j$ and the
overlap amplitudes $C_j$, is explicit and in this case is given by Eq. (6).

We present here a simple case employing pseudodata so as to demonstrate the validity and some
features of the method. Pseudodata were generated for a system described by
$f(t) = C_0 \exp(-m_0 t) + C_1 \exp(-m_1 t)$, with relative errors that grow with time,
resembling lattice data. We have arbitrarily chosen $C_0 = 1.0$, $m_0 = 0.500$, $C_1 = 3.0$ and
$m_1 = 1.00$. We have demonstrated that the extracted values have a precise meaning through the
analysis of pseudodata generated with predetermined statistical accuracy.
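
A possible way of generating such pseudodata is sketched below in Python. The generator values are those quoted above; the error model (a relative error that grows exponentially with time), its two constants, the time range and the function names are assumptions made only for this illustration, since the text does not specify them.

```python
import math
import random

def two_state(t, c0, m0, c1, m1):
    """f(t) = C0 * exp(-m0 * t) + C1 * exp(-m1 * t)."""
    return c0 * math.exp(-m0 * t) + c1 * math.exp(-m1 * t)

def make_pseudodata(times, params, rel_err0, growth, rng):
    """Return (t, value, sigma) triplets with Gaussian noise of known size.

    The relative error rel_err0 * exp(growth * t) is an assumed form that
    merely mimics the loss of signal at large times.
    """
    data = []
    for t in times:
        exact = two_state(t, *params)
        sigma = exact * rel_err0 * math.exp(growth * t)
        data.append((t, rng.gauss(exact, sigma), sigma))
    return data

rng = random.Random(2024)
params = (1.0, 0.500, 3.0, 1.00)   # C0, m0, C1, m1 as quoted in the text
pseudodata = make_pseudodata(range(1, 21), params, rel_err0=0.005, growth=0.15, rng=rng)

for t, value, sigma in pseudodata[:3]:
    print(t, value, sigma)
```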

The extracted results are statistically compatible irrespective of whether they were derived by
taking $n = 2$, $n = 3$ or $n > 3$ terms. The uncertainty of the fitted parameters grows as the
number of the (a priori unknown) terms fitted is increased. We adopted an ansatz whereby the
number of terms employed exceeds by one the number of terms that can be extracted with finite
uncertainty. Similarly, the size of the phase volume that the Monte Carlo method samples does
not affect the solution, provided that the volume is sufficiently large to include all "good
solutions". By "good solutions" we denote solutions with small or reasonable
$\chi^2$/(degrees of freedom). As shown in Fig. 1, the parameters are accurately extracted, in
complete agreement with the generator values within the stated statistical accuracy. As
expected, the search for $m_2$ yields a null result.

AMIAS has been used to analyze lattice simulation data, and the derived results [7] compare
favorably to those derived by traditional methods considered as defining the "state of the art"
in the lattice community [?].

B. Extraction of Multipole Amplitudes from
Electroexcitation Spectra

The problem of extracting multipole amplitudes from electroexcitation spectra with reduced
model uncertainty motivated the work that is reported here. In particular, the verification of
the conjecture that hadrons are non-spherical [?] has been demanding the isolation with high
precision of the small resonant quadrupole amplitudes in the $N \to \Delta$ transition. It was
observed that increased accuracy in the experimental data would not yield more precise results
[10], which were inherently limited by the analysis methods employed. This important case
provides a typical problem of high complexity, amenable to being solved by the AMIAS method.

The parameters of the model, $\{A_\nu\}$, are the multipole amplitudes, such as the
$M^{1/2}_{L\pm}$ and $M^{3/2}_{L\pm}$, using standard spectroscopic notation. They relate to
the data, cross sections and polarization observables in electroexcitation experiments, through
the CGLN formulation of the resonance electroexcitation [12]. They are infinite in number, so a
truncation is needed, and they are related to the data through a very complex convoluted
scheme, unlike the case of masses in QCD lattice data. Furthermore, in this case, the
parameters of the problem are subjected to the constraint of unitarization.

FIG. 1: AMIAS generated Probability Distribution Functions (PDFs) for the masses of the ground
state and excited states of the nucleon. The masses used to generate the pseudodata are
accurately reproduced.

The validity of the AMIAS method was demonstrated by employing pseudodata generated through the
well established MAID scheme [13] (which implements the CGLN formalism). AMIAS derived results
have been demonstrated to have a precise meaning through the analysis of pseudodata which were
generated with predetermined statistical accuracy. In the cases presented below, we have frozen
the $A^{1/2}$ and we have varied the $A^{3/2}$ helicity amplitudes. A few demonstrative cases
of the pseudodata validation are presented below.

We use pseudodata with the kinematics of the $Q^2 = 0.127$ (GeV/c)$^2$ Bates and Mainz
$N \to \Delta$ data [?] to demonstrate the validity of the analysis. The data set is published,
is well understood and is well described by MAID. Two sets of pseudodata were generated using
MAID, characterized by different statistical accuracy: "Set A", with statistical accuracy
similar to that of the experimental values, and "Set B", with statistical accuracy a hundred
times better than that of the experimental values. These data were analyzed, and the extracted
multipoles are tabulated and compared with the generator values in Table I. We have tabulated
only extracted values which are derived with uncertainties better than 100% for "Set A". It can
be seen that the AMIAS extracted multipole values are in complete agreement with the generator
values within the stated statistical accuracy. Also, as required, the quoted uncertainties are
reduced in "Set B" (hundredfold), proportionally to the statistical accuracy of the pseudodata
sets. For comparison, in Fig. 2 the probability distributions are shown for the most sensitive
amplitudes of the Bates/Mainz experimental data set analyzed with AMIAS.

TABLE I: The multipole values extracted, in units of $10^{-3}/m_\pi$, from two pseudodata sets
are compared to the generator (modified MAID) values. They are shown to be entirely compatible,
with increasing precision in the extracted parameters as the statistical accuracy of the data
increases.

Multipole   Generator   Set A           Set B
M1+         27.248      27.23 ± 0.13    27.249 ± 0.001
L0+          3.500       3.70 ± 0.23     3.502 ± 0.002
L1+          1.048       1.03 ± 0.08     1.048 ± 0.001
E1+          1.481       1.49 ± 0.18     1.482 ± 0.002
E0+          4.225       3.68 ± 1.02     4.239 ± 0.013
M1-          4.119       4.47 ± 1.31     4.124 ± 0.013
L1-          1.205       1.05 ± 0.43     1.203 ± 0.008
E2-          1.024       1.07 ± 0.45     1.027 ± 0.006
L2+          0.007       0.02 ± 0.01     0.008 ± 0.001
E2+          0.006       0.01 ± 0.01     0.007 ± 0.001

FIG. 2: PDFs for the norms of some of the sensitive amplitudes of the analyzed Bates/Mainz
experimental data set (panels: M1+ and L1+). The distributions allow the determination of the
central value and corresponding uncertainty for each of the multipoles.

Verifying the ability of AMIAS to extract uncertainties which have a precise statistical
interpretation is generally more difficult. The scaling behavior exhibited by the two sets of
pseudodata, discussed above, is a necessary but not sufficient condition. The definitive
validation was achieved by introducing an arbitrary uncertainty, a "generator uncertainty", to
the nominal generator multipole values. Multiple sets of data generated by randomized input
within the allowed uncertainties of the generator parameters are recovered by AMIAS. This
demonstration exercise was performed both for simple functional forms (e.g. polynomial
functions) and for complicated cases such as this one (multipole amplitudes in a CGLN
formalism), results of which have been presented in [15]. Furthermore, in the case of
polynomial functions and lattice QCD two-point functions, derived jackknife errors are found to
be statistically compatible with AMIAS uncertainties.
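
One common way of checking that quoted uncertainties carry a precise statistical meaning is to inspect the distribution of "pulls", (extracted value - generator value) / quoted uncertainty, over many pseudodata sets; it should have zero mean and unit width. The Python sketch below illustrates the idea on a deliberately simple stand-in, a one-parameter constant model, for which the $e^{-\chi^2/2}$-weighted mean and width are known in closed form (the error-weighted average and its standard error). It is not the validation performed by the authors, and all names and numbers are hypothetical.

```python
import math
import random

def weighted_mean(values, errors):
    """Closed-form mean and width of exp(-chi2/2) for a one-parameter constant model."""
    w = [1.0 / e ** 2 for e in errors]
    mean = sum(wi * v for wi, v in zip(w, values)) / sum(w)
    return mean, 1.0 / math.sqrt(sum(w))

def pulls(truth, errors, n_trials, rng):
    """Pull = (estimate - truth) / quoted uncertainty for many pseudodata sets."""
    out = []
    for _ in range(n_trials):
        values = [rng.gauss(truth, e) for e in errors]
        estimate, sigma = weighted_mean(values, errors)
        out.append((estimate - truth) / sigma)
    return out

rng = random.Random(7)
errors = [0.1, 0.2, 0.15, 0.3, 0.25]   # hypothetical data uncertainties
p = pulls(3.0, errors, 2000, rng)

pull_mean = sum(p) / len(p)
pull_width = math.sqrt(sum((x - pull_mean) ** 2 for x in p) / (len(p) - 1))
print(pull_mean, pull_width)
```

In the same toy model, dividing all data uncertainties by one hundred divides the quoted uncertainty by one hundred, which is the scaling behavior referred to above as necessary but not sufficient.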

In summary: a novel method of analysis is shown to offer significant advantages over existing
methods in determining physical parameters from experimental or simulation data: it is
computationally robust and it provides methodology-independent answers with maximal precision
in terms of the derived Probability Distribution Function for each parameter.

We thank Prof. E. Manousakis for enlightening discussions on the use of statistical theory and
Prof. C. Alexandrou for suggesting the use of AMIAS for lattice QCD gauge simulations.